Supervised Transfer Learning for Product Information Question Answering
Popular e-commerce websites such as Amazon offer community question answering
systems where users can pose product-related questions and experienced customers
may provide answers voluntarily. In this paper, we show that the large volume
of existing community question answering data can be beneficial when building a
system for answering questions related to product facts and specifications. Our
experimental results demonstrate that the performance of a model for answering
questions related to products listed on the Home Depot website can be improved
by a large margin via a simple transfer learning technique from an existing
large-scale Amazon community question answering dataset. Transfer learning can
result in an increase of about 10% in accuracy in the experimental setting
where we restrict the size of the data of the target task used for training. As
an application of this work, we integrate the best performing model trained in
this work into a mobile-based shopping assistant and show its usefulness.
Comment: 2018 17th IEEE International Conference on Machine Learning and Applications
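The two-stage transfer recipe described here, pre-training on the large source dataset and then fine-tuning on the smaller target dataset, can be sketched roughly as follows. This is a minimal sketch with synthetic data; the model architecture, the binary answer-relevance framing, and the dataset sizes are illustrative assumptions, not the paper's exact setup.

```python
# Minimal sketch of source-then-target transfer learning for answer scoring.
# Feature vectors stand in for encoded (question, answer) pairs; all data is synthetic.
import torch
import torch.nn as nn

class AnswerScorer(nn.Module):
    """Scores a (question, answer) pair encoded as a single feature vector."""
    def __init__(self, dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)

def train(model, x, y, epochs=3, lr=1e-3):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()

model = AnswerScorer()

# Stage 1: pre-train on the large source dataset (Amazon community QA, simulated here).
x_src, y_src = torch.randn(10_000, 128), torch.randint(0, 2, (10_000,)).float()
train(model, x_src, y_src)

# Stage 2: fine-tune on the small target dataset (Home Depot product QA, simulated here);
# a lower learning rate helps retain what was learned from the source data.
x_tgt, y_tgt = torch.randn(500, 128), torch.randint(0, 2, (500,)).float()
train(model, x_tgt, y_tgt, lr=1e-4)
```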
Modeling Non-Standard Text Classification Tasks
Text classification deals with discovering knowledge in texts and is used for extracting, filtering, or retrieving information in streams and collections. The discovery of knowledge is operationalized by modeling text classification tasks, which is mainly a human-driven engineering process. The outcome of this process, a text classification model, is used to inductively learn a text classification solution from a priori classified examples. The building blocks of modeling text classification tasks cover four aspects: (1) the way examples are represented, (2) the way examples are selected, (3) the way classifiers learn from examples, and (4) the way models are selected.
This thesis proposes methods that improve the prediction quality of text classification solutions for unseen examples, especially for non-standard tasks where standard models do not fit. The original contributions are related to the aforementioned building blocks: (1) Several topic-orthogonal text representations are studied in the context of non-standard tasks and a new representation, namely co-stems, is introduced. (2) A new active learning strategy that goes beyond standard sampling is examined. (3) A new one-class ensemble for improving the effectiveness of one-class classification is proposed. (4) A new model selection framework to cope with subclass distribution shifts that occur in dynamic environments is introduced.
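For context on contribution (2), the "standard sampling" baseline that the thesis' active learning strategy goes beyond is typically a pool-based uncertainty-sampling loop like the sketch below. The synthetic data, the classifier, and the query budget are assumptions for illustration, not the thesis' experimental setup.

```python
# Pool-based active learning with uncertainty sampling on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=2000, n_features=50, random_state=0)
labeled = list(range(20))                             # small seed set of labeled examples
pool = [i for i in range(len(X)) if i not in labeled]  # unlabeled pool

clf = LogisticRegression(max_iter=1000)
for _ in range(10):                                   # 10 labeling rounds
    clf.fit(X[labeled], y[labeled])
    probs = clf.predict_proba(X[pool])
    # Uncertainty sampling: query the pool example closest to the decision boundary.
    query = pool[int(np.argmin(np.abs(probs[:, 1] - 0.5)))]
    labeled.append(query)                             # an oracle supplies the label
    pool.remove(query)
```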
Envisioning the Next-Gen Document Reader
People read digital documents on a daily basis to share, exchange, and
understand information in electronic settings. However, current document
readers create a static, isolated reading experience, which does not support
users' goals of gaining more knowledge and performing additional tasks through
document interaction. In this work, we present our vision for the next-gen
document reader that strives to enhance user understanding and create a more
connected, trustworthy information experience. We describe 18 NLP-powered
features to add to existing document readers and propose a novel plug-in
marketplace that allows users to further customize their reading experience, as
demonstrated through 3 exploratory UI prototypes available at
https://github.com/catherinesyeh/nextgen-prototypes
Comment: Paper accepted at the AAAI 2023 Workshop on Scientific Document Understanding
OATS: Opinion Aspect Target Sentiment Quadruple Extraction Dataset for Aspect-Based Sentiment Analysis
Aspect-based sentiment analysis (ABSA) delves into understanding sentiments
specific to distinct elements within textual content. It aims to analyze
user-generated reviews to determine a) the target entity being reviewed, b) the
high-level aspect to which it belongs, c) the sentiment words used to express
the opinion, and d) the sentiment expressed toward the targets and the aspects.
While various benchmark datasets have fostered advancements in ABSA, they often
come with domain limitations and data granularity challenges. Addressing these,
we introduce the OATS dataset, which encompasses three fresh domains and
consists of 20,000 sentence-level quadruples and 13,000 review-level tuples.
Our initiative seeks to bridge specific observed gaps: the recurrent focus on
familiar domains like restaurants and laptops, limited data for intricate
quadruple extraction tasks, and an occasional oversight of the synergy between
sentence and review-level sentiments. Moreover, to elucidate OATS's potential
and shed light on various ABSA subtasks that OATS can solve, we conducted
in-domain and cross-domain experiments, establishing initial baselines. We hope
the OATS dataset augments current resources, paving the way for an encompassing
exploration of ABSA.
Comment: Initial submission
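To make the quadruple formulation concrete, a sentence-level annotation of the kind described above (target, aspect, opinion expression, sentiment) might be represented as in the sketch below. The field names, aspect-category strings, and the example review are illustrative assumptions, not the OATS annotation schema.

```python
# A possible in-memory representation of sentence-level ABSA quadruples.
from dataclasses import dataclass

@dataclass
class OpinionQuadruple:
    target: str      # the entity being reviewed
    aspect: str      # the high-level aspect category it belongs to
    opinion: str     # the sentiment words expressing the opinion
    sentiment: str   # polarity toward the target/aspect

sentence = "The battery lasts all day but the screen is too dim."
quads = [
    OpinionQuadruple("battery", "POWER#GENERAL", "lasts all day", "positive"),
    OpinionQuadruple("screen", "DISPLAY#QUALITY", "too dim", "negative"),
]
```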
LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding
Instruction tuning unlocks the superior capability of Large Language Models
(LLMs) to interact with humans. Furthermore, recent instruction-following
datasets include images as visual inputs, collecting responses for image-based
instructions. However, visual instruction-tuned models cannot comprehend
textual details within images well. This work enhances the current visual
instruction tuning pipeline with text-rich images (e.g., movie posters and book
covers). Specifically, we first use publicly available OCR tools to
collect results on 422K text-rich images from the LAION dataset. Moreover, we
prompt text-only GPT-4 with recognized texts and image captions to generate 16K
conversations, each containing question-answer pairs for text-rich images. By
combining our collected data with previous multi-modal instruction-following
data, our model, LLaVAR, substantially improves the LLaVA model's capability on
text-based VQA datasets (up to 20% accuracy improvement) while achieving an
accuracy of 91.42% on ScienceQA. The GPT-4-based instruction-following
evaluation also demonstrates the improvement of our model on both natural
images and text-rich images. Through qualitative analysis, LLaVAR shows
promising interaction (e.g., reasoning, writing, and elaboration) skills with
humans based on the latest real-world online content that combines text and
images. We make our code/data/models publicly available at
https://llavar.github.io/.
Comment: Preprint. Work in progress
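The data-generation step described above, prompting a text-only model with OCR results and an image caption to produce question-answer conversations, could be organized roughly as in the sketch below. The prompt wording, helper name, and example metadata are assumptions for illustration and are not LLaVAR's released prompts or pipeline.

```python
# Sketch: assemble a text-only prompt from OCR output and a caption, whose reply
# would later be paired with the original image as a visual instruction-tuning example.
def build_prompt(caption: str, ocr_tokens: list[str]) -> str:
    return (
        "You are given a description of an image and the text visible in it.\n"
        f"Caption: {caption}\n"
        f"OCR text: {' '.join(ocr_tokens)}\n"
        "Write a short conversation of question-answer pairs about the image, "
        "asking only about things answerable from the caption or the OCR text."
    )

example = {
    "caption": "A movie poster with bold red lettering on a dark background.",
    "ocr": ["MIDNIGHT", "EXPRESS", "COMING", "SOON", "JULY", "14"],
}
print(build_prompt(example["caption"], example["ocr"]))
```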